Facial Action Unit Recognition from Video Streams with Recurrent Neural Networks
Abstract
Facial expressions are one of the parameters for accessing individual behavioral processes. Their recognition and verification can be framed as the identification of states of dynamical systems generated by physiological processes. Whereas a snapshot of a dynamical system gives information about its current state, a time series of past states captures its trajectory in state space. The description and recognition of facial expressions using atomic muscle movements, so-called action units, provide an extensive framework. The temporal modeling and recognition of these muscle movements promise a broader and more generic approach for recognizing subtle changes in the facial region. This paper proposes the use of recurrent neural networks for modeling facial action unit activity. Recurrent neural networks are able to model actions based on their previous and current states, unlike other dynamic classifiers such as hidden Markov models. A detailed comparison with the recognition performance of a static classifier, the support vector machine, suggests that recurrent neural networks gain more knowledge about action unit activation when presented with a sequence of images. On average, our model achieved a positive hit rate of 85.8% for upper face action units and 84.9% for lower face action units.
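The contrast the abstract draws between a snapshot and a trajectory can be illustrated with a minimal sketch of a recurrent forward pass. This is not the paper's architecture: the Elman-style layer, the layer sizes, the random untrained weights, and the per-frame feature vectors are all assumptions made purely for illustration.

```python
import math
import random

def rnn_forward(frames, W_in, W_rec, w_out):
    """Run an Elman-style recurrent layer over per-frame feature vectors.

    Returns one activation probability per frame for a single action unit.
    """
    hidden = [0.0] * len(W_rec)  # state before the first frame
    probs = []
    for x in frames:
        new_hidden = []
        for j in range(len(hidden)):
            # Contribution of the current frame ...
            total = sum(W_in[j][i] * x[i] for i in range(len(x)))
            # ... plus the contribution of the previous hidden state.
            total += sum(W_rec[j][k] * hidden[k] for k in range(len(hidden)))
            new_hidden.append(math.tanh(total))
        hidden = new_hidden
        logit = sum(w_out[j] * hidden[j] for j in range(len(hidden)))
        probs.append(1.0 / (1.0 + math.exp(-logit)))
    return probs

# Hypothetical toy setup: 3 features per frame, 4 hidden units, random weights.
rng = random.Random(0)
n_feat, n_hidden = 3, 4
W_in = [[rng.uniform(-1, 1) for _ in range(n_feat)] for _ in range(n_hidden)]
W_rec = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_hidden)]
w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]

# A made-up sequence of 5 frames standing in for extracted facial features.
frames = [[rng.uniform(-1, 1) for _ in range(n_feat)] for _ in range(5)]

probs = rnn_forward(frames, W_in, W_rec, w_out)
# The same final frame, seen without its history, as a static classifier would see it.
last_alone = rnn_forward([frames[-1]], W_in, W_rec, w_out)[0]
```

Here `probs[-1]` and `last_alone` score the same image, but the first is conditioned on the four preceding frames through the hidden state, whereas the second sees only the snapshot. A static classifier such as a support vector machine applied frame by frame is in the position of `last_alone`.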
Similar resources
Hand Gesture Recognition from RGB-D Data using 2D and 3D Convolutional Neural Networks: a comparative study
Despite considerable advances in recognizing hand gestures from still images, there are still many challenges in the classification of hand gestures in videos. The latter comes with more challenges, including higher computational complexity and the arduous task of representing temporal features. Hand movement dynamics, represented by temporal features, have to be extracted by analyzing the total fr...
The University of Passau Open Emotion Recognition System for the Multimodal Emotion Challenge
This paper presents the University of Passau’s approaches for the Multimodal Emotion Recognition Challenge 2016. For audio signals, we exploit Bag-of-Audio-Words techniques combining Extreme Learning Machines and Hierarchical Extreme Learning Machines. For video signals, we use not only the information from the cropped face of a video frame, but also the broader contextual information from the ...
Multi-Level ResNets with Stacked SRUs for Action Recognition
Most existing Convolutional Neural Networks (CNNs) used for action recognition are either difficult to optimize or underuse crucial temporal information. Inspired by the fact that recurrent models consistently make breakthroughs in sequence-related tasks, we propose a novel Multi-Level Recurrent Residual Network (MRRN) which incorporates three recognition streams. Each stream consists ...
Deep Convolutional Neural Networks for Smile Recognition
This thesis describes the design and implementation of a smile detector based on deep convolutional neural networks. It starts with a summary of neural networks, the difficulties of training them and new training methods, such as Restricted Boltzmann Machines or autoencoders. It then provides a literature review of convolutional neural networks and recurrent neural networks. In order to select ...
Adaptive Detrending to Accelerate Convolutional Gated Recurrent Unit Training for Contextual Video Recognition
Building on progress in image recognition, video recognition has been extensively studied recently. However, most existing methods focus on short-term rather than long-term video recognition; the latter is called contextual video recognition. To address contextual video recognition, we use convolutional recurrent neural networks (ConvRNNs) having a rich spatiotemporal information processing capabil...